(19) Europäisches Patentamt
European Patent Office
Office européen des brevets

(11) EP 1 513 321 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 09.03.2005 Bulletin 2005/10

(21) Application number: 04017686.9

(22) Date of filing: 26.07.2004

(51) Int Cl.7: H04L 29/08, H04L 12/56



(84) Designated Contracting States:
AT BE BG CH CY CZ DE DK EE ES FI FR GB GR
HU IE IT LI LU MC NL PL PT RO SE SI SK TR
Designated Extension States:
AL HR LT LV MK

(30) Priority: 29.08.2003 US 652270

(71) Applicant: Broadcom Corporation
Irvine, California 92618-7013 (US)

(72) Inventors:
• Elzur, Uri
Irvine, CA 92606 (US)
• Fan, Kan F.
Diamond Bar, CA 91765 (US)
• Lindsay, Steven
Mission Viejo, CA 92692 (US)
• McDaniel, Scott S.
Villa Park, CA 92861 (US)

(74) Representative: Jehle, Volker Armin, Dipl.-Ing.
Bosch, Graf von Stosch, Jehle,
Flüggenstrasse 13
80639 München (DE)








(54) System and method for TCP/IP offload independent of bandwidth delay product 



(57) Aspects of the invention may provide TCP of- 
fload, which may include acquiring TCP connection var- 
iables from a host and managing at least one TCP con- 
nection using the acquired TCP connection variables. 
At least a portion of the acquired TCP connection vari- 
ables may be updated and at least some of the updated 
TCP connection variables may be transferred back to 
the host. In an aspect of the invention, the TCP connec- 
tion variables may be variables that are independent of 
bandwidth delay product. At least a portion of the updat- 
ed TCP connection variables may be utilized by the host 
to process the TCP connection or another TCP connec- 
tion. The host may push the variables onto the stack and 
the TOE may pull the variables from the stack. Also, up- 
dated TCP connection variables may be pushed on the 
stack by the TOE and pulled from the stack by the host.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS/ 
INCORPORATION BY REFERENCE 

[0001] This application makes reference to, claims 
priority to and claims benefit from: United States Provi- 
sional Patent Application Serial No. 60/408,617, entitled 
"System and Method for TCP/IP Offload" filed on Sep- 
tember 6, 2002; and United States Provisional Patent 
Application Serial No. 60/407,165, filed on August 30, 
2002. 

[0002] The above stated applications are incorporated
herein by reference in their entirety.

FIELD OF THE INVENTION 

[0003] Certain embodiments of the present invention 
relate to processing of TCP data and related TCP information.
More specifically, certain embodiments relate to
a method and system for TCP/IP offload independent of 
bandwidth delay product. 

BACKGROUND OF THE INVENTION 

[0004] The initial development of transmission control
protocol (TCP) was based on the networking and processing
capabilities that were available at the time. As a
result, various fundamental assumptions regarding its
operation were predicated on the networking and processor
technologies that existed at that time. Among the
assumptions on which TCP was predicated are the
scarcity and high cost of bandwidth and the virtually
limitless processing resources available to a host processor.
With the advent of technologies such as Gigabit Ethernet
(GbE), these fundamental assumptions have radically
changed to the point where bandwidth is no longer
as scarce and expensive and the host processing resources
are now regarded as being limited rather than
virtually infinite. In this regard, the bottleneck has shifted
from the network bandwidth to the host processing
bandwidth. Since host processing systems do more
than merely provide faster network connections, shifting
network resources to provide much faster network
connections will do little to address the fundamental
change in assumptions. Notably, shifting network resources
to provide much faster network connections
would occur at the expense of executing system applications,
thereby resulting in degradation of system performance.

[0005] Although new networking architectures and 
protocols could be created to address the fundamental 
shift in assumptions, the new architectures and proto- 
cols would still have to provide support for current and 
legacy systems. Accordingly, solutions are required to 
address the shift in assumptions and to alleviate any 
bottlenecks that may result with host processing systems.
A transmission control protocol offload engine
(TOE) may be utilized to redistribute TCP processing
from the host system onto specialized processors which 
may have suitable software for handling TCP process- 
ing. The TCP offload engines may be configured to im- 
plement various TCP algorithms for handling faster net- 
work connections, thereby allowing host system 
processing resources to be allocated or reallocated to 
application processing. 

[0006] In order to alleviate the consumption of host 
resources, a TCP connection can be offloaded from a 
host to a dedicated TCP/IP offload engine (TOE). Some 
of these host resources may include CPU cycles and 
subsystem memory bandwidth. During the offload proc- 
ess, TCP connection state information is offloaded from 
the host, for example from a host software stack, to the 
TOE. A TCP connection can be in any one of a plurality 
of states at a given time. To process the TCP connec- 
tion, TCP software may be adapted to manage various 
TCP defined states. Being able to manage the various 
TCP defined states may require a high level of architec- 
tural complexity in the TOE. 

[0007] Offloading state information utilized for 
processing a TCP connection to the TOE may not nec- 
essarily be the best solution because many of the states 
such as CLOSING, LAST_ACK and FIN_WAIT_2 may 
not be performance sensitive. Furthermore, many of 
these non-performance sensitive states may consume 
substantial processing resources to handle, for exam- 
ple, error conditions and potentially malicious attacks. 
These are but some of the factors that substantially in- 
crease the cost of building and designing the TOE. In 
addition, a TOE that has control, transferred from the 
host, of all the state variables of a TCP connection may 
be quite complex, can use considerable processing 
power and may require and consume a lot of TOE on-board
memory. Moreover, the TCP connection offloaded
to the TOE that has control, transferred from the host,
of all the state variables of the TCP connection can be 
inflexible and susceptible to connection loss. 
[0008] TCP segmentation is a technology that may 
permit a very small portion of TCP processing to be of- 
floaded to a network interface card (NIC). In this regard, 
a NIC that supports TCP segmentation does not truly 
incorporate a full transmission control processing of- 
fload engine. Rather, a NIC that supports TCP segmen- 
tation only has the capability to segment outbound TCP 
blocks into packets having a size equivalent to that 
which the physical medium supports. Each of the outbound
TCP blocks is smaller than a permissible TCP
window size. For example, an Ethernet network interface
card that supports TCP segmentation may segment
a 4KB block of TCP data into 3 Ethernet packets.
The maximum size of an Ethernet packet is 1518 bytes 
inclusive of header and a trailing CRC. 
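
The packet count in this example can be checked with a short calculation. The sketch below is illustrative only: the 14-byte Ethernet header, 4-byte CRC and option-free 20-byte IP and TCP headers are assumptions that the text does not state, and the function names are hypothetical.

```python
import math

ETH_MAX_FRAME = 1518  # maximum Ethernet packet size given in the text
ETH_HEADER = 14       # assumed: untagged Ethernet header
ETH_CRC = 4           # assumed: trailing CRC
IP_HEADER = 20        # assumed: IPv4 header without options
TCP_HEADER = 20       # assumed: TCP header without options


def max_tcp_payload():
    """Payload bytes left in one full-size Ethernet packet."""
    return ETH_MAX_FRAME - ETH_HEADER - ETH_CRC - IP_HEADER - TCP_HEADER


def packets_needed(block_bytes):
    """Number of Ethernet packets required to carry one block of TCP data."""
    return math.ceil(block_bytes / max_tcp_payload())


payload = max_tcp_payload()   # 1460 under the assumptions above
count = packets_needed(4096)  # a 4KB block needs 3 packets
```

Under these assumptions, two packets carry 1460 bytes each and the third carries the remaining 1176 bytes.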
[0009] A device that supports TCP segmentation 
does track certain TCP state information such as the 
TCP sequence number that is related to the data that 
the offload NIC is segmenting. However, the device that
supports TCP segmentation does not track any state information
that is related to inbound traffic, or any state
information that is required to support TCP acknowledgements
or flow control. A NIC that supports full TCP
offload in the established state is responsible for handling
TCP flow control, handling incoming TCP
acknowledgements, and generating outbound
TCP acknowledgements for incoming data.
[0010] TCP segmentation may be viewed as a subset
of TCP offload. TCP segmentation allows the protocol
stack or operating system to pass information in the form
of blocks of TCP data that have not been segmented into
individual TCP packets to a device driver. The block of
data may be greater than the size of an Ethernet packet.
For instance, the block of data to be segmented could
be 4 Kbytes or 16 Kbytes. A network adapter associated
with the device driver may acquire the blocks of TCP
data, packetize the acquired blocks of TCP data into
1518-byte Ethernet packets and update certain fields in
each incrementally created packet. For example, the
network adapter may update a corresponding TCP sequence
number for each of the TCP packets by incrementing
the TCP sequence number for each of the
packets. In another example, an IP identification (IP ID)
field and flag field would also have to be updated for
each packet. One limitation with TCP segmentation is
that TCP segmentation may only be done on a block of
data that is less than a TCP window size. This is due to
the fact that a device implementing TCP segmentation
has no influence over TCP flow control. Accordingly, the
device implementing TCP segmentation only segments
outbound TCP packets.
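
The per-packet field updates described above can be sketched as follows. This is a minimal illustration, not the adapter's actual logic: the 1460-byte payload size, the dictionary layout, the function name `segment_block`, and the choice to set the PSH flag only on the final packet are all assumptions made for the example.

```python
MSS = 1460  # assumed payload bytes per packet (not stated in the text)


def segment_block(data, start_seq, start_ip_id, mss=MSS):
    """Split one block of TCP data into per-packet records.

    Each record carries the fields the text says must be updated:
    the TCP sequence number, the IP ID, and a flag field.
    """
    packets = []
    seq, ip_id = start_seq, start_ip_id
    for offset in range(0, len(data), mss):
        chunk = data[offset:offset + mss]
        is_last = offset + mss >= len(data)
        packets.append({
            "seq": seq,
            "ip_id": ip_id,
            "psh": is_last,  # assumption: PSH set only on the last packet
            "payload": chunk,
        })
        seq = (seq + len(chunk)) % 2**32  # sequence numbers are 32-bit
        ip_id = (ip_id + 1) % 2**16       # IP ID is a 16-bit field
    return packets


packets = segment_block(bytes(4096), start_seq=1000, start_ip_id=7)
```

Note that nothing in this sketch looks at acknowledgements or the peer's window, which is why such a device can only accept blocks smaller than the TCP window.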

[0011] A TCP segmentation device does not examine
incoming packets and as such, has no influence over 
flow control. Any received acknowledgement packet is 
passed up to the host for processing. In this regard, ac- 
knowledgement packets that are utilized for flow control 
are not processed by the TCP segmentation device. 
Moreover, a TCP segmentation device does not perform 
congestion control or "slow-start" and does not calculate 
or modify any variables that are passed back to the op- 
erating system and/or host system processor. 
[0012] Another limitation with TCP segmentation is 
that information tracked by TCP segmentation is only 
information that is pertinent for the lifetime of the TCP 
data. In this regard, for example, the TCP segmentation 
device may track TCP sequence numbers but not
TCP acknowledgement (ACK) numbers. Accordingly, 
the TCP segmentation device tracks only a minimal sub- 
set of information related to corresponding TCP data. 
This limits the capability and/or functionality of the TCP 
segmentation device. A further limitation with TCP seg- 
mentation is that a TCP segmentation device does not 
pass TCP processed information back to an operating 
system and/or host processor. This lack of feedback lim- 
its the TCP processing that otherwise may be achieved 
by an operating system and/or host system processor. 
[0013] Further limitations and disadvantages of conventional
and traditional approaches will become apparent
to one of skill in the art, through comparison of such
systems with some aspects of the present invention as 
set forth in the remainder of the present application with 
reference to the drawings. 

BRIEF SUMMARY OF THE INVENTION 

[0014] Aspects of the invention may be found in, for 
example, systems and methods that provide TCP/IP of- 
fload. In one embodiment of the invention, a system for 
TCP/IP offload may include, for example, a host and a 
TCP/IP offload engine (TOE). The host may be coupled 
to the TOE. The host may transfer control of at least a 
portion of TCP connection variables associated with the 
TCP connection to the TOE. The TOE may update at 
least a portion of the TCP connection variables and 
transfer or feed back the updated TCP connection variables
to the host.

[0015] In accordance with another embodiment of the
invention, a system is provided for TCP connection of- 
fload. The system may include, for example, a host and 
a network interface card (NIC) that may be coupled to 
the host. For a particular connection offloaded to the 
NIC, control of state information is split between the host 
and the NIC. Accordingly, information may be trans- 
ferred to the NIC and the NIC may update at least a por- 
tion of the transferred information. Subsequently, the 
NIC may transfer at least a portion of the updated infor- 
mation back to the host where the host may utilize this 
information to manage this and/or another connection. 
[0016] In another embodiment, the invention may pro- 
vide a method for TCP/IP offload. The method may in- 
clude, for example, one or more of the following: decid- 
ing to offload a particular TCP connection from a host 
to a TOE; transferring control of at least a portion of con- 
nection variables associated with the particular TCP 
connection from the host to the TOE; sending a snap- 
shot of remaining connection variables whose control 
was not transferred to the TOE; and managing the par- 
ticular TCP connection via the TOE using the connec- 
tion variables transferred to the TOE and/or using the 
snapshot. At least a portion of updated connection var- 
iables and/or snapshot variables associated with the 
TCP connection may be transferred back to the host for 
processing by the host. 

[0017] Another embodiment of TCP/IP offload meth- 
od may include, for example, one or more of the follow- 
ing: deciding to offload an established TCP connection 
from a host to a TOE; transferring control of segment- 
variant variables to the TOE from the host; sending a 
snapshot of segment-invariant variables and connec- 
tion-invariant variables to the TOE; and independently 
processing incoming TCP packets via the TOE based 
upon the segment-variant variables and the snapshot. 
The TOE may update at least a portion of the segment-variant
variables and snapshot and transfer at least portions
of the segment-variant variables and the snapshot
back to the host. In an embodiment of the invention, the
host may handle all TCP states except possibly for the 
ESTABLISHED state which may be offloaded to the 
TOE. 

[0018] The invention may also include a method that 
processes a TCP connection, which may include, for ex- 
ample, one or more of the following: establishing the 
TCP connection; sharing a control plane for the TCP 
connection between a host and a TOE; and communi- 
cating updated TCP connection variables from the TOE 
back to the host. Accordingly, at least a portion of the 
updated TCP connection variables may be utilized to 
control the TCP connection and/or another TCP con- 
nection. 

[0019] In another embodiment of the invention, a 
method for TCP offload may include acquiring TCP con- 
nection variables from a host and managing at least one 
TCP connection using the acquired TCP connection 
variables. At least a portion of the acquired TCP con- 
nection variables may be updated and at least some of 
the updated TCP connection variables may be trans- 
ferred back to the host. The TCP connection variables 
may be independent of bandwidth delay product. At 
least a portion of the updated TCP connection variables 
may be utilized by the host to process the TCP connec- 
tion or another TCP connection. A stack may be utilized 
to transfer the TCP connection variables between at 
least the host and a TOE. In this regard, the TOE may 
pull the TCP connection variables from the stack and 
the host may push the TCP connection variables onto 
the stack. Also, the updated TCP connection variables 
may be placed on the stack by the TOE and the host 
may subsequently pull the updated TCP connection var- 
iables from the stack. 

[0020] The invention may also provide a machine- 
readable storage, having stored thereon, a computer 
program having at least one code section for providing 
TCP offload. The at least one code section may be ex- 
ecutable by a machine for causing the machine to per- 
form steps which may include acquiring TCP connection 
variables from a host and managing at least one TCP 
connection using the acquired TCP connection varia- 
bles. At least a portion of the acquired TCP connection 
variables may be updated and transferred back to the 
host. The TCP connection variables may be independ- 
ent of bandwidth delay product. The machine-readable 
storage may further include code for utilizing at least a 
portion of the updated TCP connection variables to 
process the TCP connection or another TCP connec- 
tion. In another aspect of the invention, the machine- 
readable storage may include code for pulling the TCP 
connection variables from a stack, code for pushing up- 
dated TCP connection variables onto the stack, and 
code for pulling connection variables from the stack. 

According to an aspect of the invention, a system 
for providing TCP/IP offload is provided, comprising: 

a host; and 



a TCP/IP offload engine (TOE) coupled to said host, 
wherein said host transfers control of at least a por- 
tion of TCP connection variables to said TOE. 

According to an aspect of the invention, a system
for providing TCP/IP offload comprises:

a host; and

a TCP/IP offload engine (TOE) coupled to said host,
wherein said host transfers control of at least a portion
of TCP connection variables to said TOE and
said TOE provides updated TCP variables back to
said host.

Advantageously, said host transfers control of 
segment-variant TCP connection variables to said TOE 
and said TOE provides updated TCP segment-variant 
TCP connection variables back to said host. 

Advantageously, said TCP segment-variant TCP
connection variables further comprise:

IP packet identifier; 
time remaining for retransmission; 
time remaining for delay acknowledgement; 
time remaining for keep alive; 
congestion window variables comprising: 

congestion window (SND_CWIN); and 
slow start threshold (SSTHRESH); 

round trip time variables comprising: 

smoothed round trip time (RTT); 
smoothed delta (DELTA); and 
time remaining for PUSH; and 

TCP state and timestamp send and receive se- 
quence variables comprising: 

sequence number for first un-ACK'd data (SND_UNA);
sequence number for next send (SND_NXT);
maximum sequence number ever sent (SND_MAX);
maximum send window (MAX_WIN);
sequence number for next receive (RCV_NXT); and

receive window size (RCV_WND). 
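
The segment-variant variables listed above can be gathered into a single record for illustration. The following is a sketch only: the field names are lower-case renderings of the names in the list, while the types, units, default values and the helper `bytes_in_flight` are assumptions that do not come from the text.

```python
from dataclasses import dataclass, asdict


@dataclass
class SegmentVariantVars:
    """Hypothetical container for the segment-variant variables listed above."""
    ip_id: int = 0                      # IP packet identifier
    retransmit_time_left: float = 0.0   # time remaining for retransmission
    delayed_ack_time_left: float = 0.0  # time remaining for delay acknowledgement
    keepalive_time_left: float = 0.0    # time remaining for keep alive
    push_time_left: float = 0.0         # time remaining for PUSH
    snd_cwin: int = 0                   # congestion window (SND_CWIN)
    ssthresh: int = 0                   # slow start threshold (SSTHRESH)
    rtt: float = 0.0                    # smoothed round trip time (RTT)
    delta: float = 0.0                  # smoothed delta (DELTA)
    tcp_state: str = "ESTABLISHED"      # TCP state
    timestamp: int = 0                  # timestamp
    snd_una: int = 0                    # first un-ACK'd sequence number (SND_UNA)
    snd_nxt: int = 0                    # next sequence number to send (SND_NXT)
    snd_max: int = 0                    # maximum sequence number ever sent (SND_MAX)
    max_win: int = 0                    # maximum send window (MAX_WIN)
    rcv_nxt: int = 0                    # next sequence number to receive (RCV_NXT)
    rcv_wnd: int = 0                    # receive window size (RCV_WND)

    def bytes_in_flight(self):
        """Sent but unacknowledged bytes, using 32-bit sequence arithmetic."""
        return (self.snd_nxt - self.snd_una) % 2**32


vars_for_conn = SegmentVariantVars(snd_una=100, snd_nxt=1560)
in_flight = vars_for_conn.bytes_in_flight()
```

Keeping these fields in one record makes it straightforward to hand the whole set to the TOE on offload and to return it to the host on upload, for example via `asdict`.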

Advantageously, said host provides said TOE with 
a snapshot of at least one of connection-invariant TCP 
connection variables and segment-invariant TCP con- 
nection variables, said TOE transferring updated at 
least one of connection-invariant TCP connection vari- 
ables and segment-invariant TCP connection variables 
back to said host; and 

wherein said TOE utilizes transferred segment-variant
TCP connection variables and said snapshot of
said at least one of said connection-invariant TCP connection
variables and said segment-invariant TCP connection
variables to process at least one of incoming
and outgoing TCP segments.

Advantageously, said TOE utilizes said trans- 
ferred segment-variant TCP connection variables and a 
snapshot of at least one of connection-invariant TCP 
connection variables and segment-invariant TCP con- 
nection variables to independently process incoming 
TCP segments. 

Advantageously, said host provides said TOE with 
a snapshot of connection-invariant TCP connection var- 
iables and segment-invariant TCP connection varia- 
bles. 

Advantageously, said host provides said TOE with 
a snapshot of TCP connection variables that are not 
segment-variant TCP connection variables. 

Advantageously, said host posts at least one buff- 
er that is utilized by any TCP connection, a buffer that 
is dedicated to one or more TCP connections and an 
application buffer. 

Advantageously, said TOE manages said TCP 
connection. 

Advantageously, said TOE manages at least one 
of segmentation, acknowledgement processing, win- 
dowing and congestion avoidance. 

Advantageously, said TOE maintains exclusive 
read-write access to offloaded segment-variant varia- 
bles. 

Advantageously, said TOE exclusively updates of- 
floaded segment-variant variables. 

Advantageously, said host maintains read-write 
access to segment-invariant variables. 

Advantageously, said TOE has read-only access 
to segment-invariant variables. 

Advantageously, said host sends a message to 
said TOE concerning a change in a particular TCP con- 
nection variable whose control was not transferred to 
said TOE, and 

said TOE updates said particular TCP connection 
variable. 
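
The access rules in the preceding statements can be sketched as a small ownership model. This is an illustration under stated assumptions: the class names `Host` and `Toe`, the method names, and the use of `mss` as an example of a segment-invariant variable are hypothetical and are not taken from the text.

```python
from types import MappingProxyType


class Toe:
    """TOE side: owns segment-variant variables, sees a read-only snapshot."""

    def __init__(self, segment_variant, snapshot):
        self.variant = dict(segment_variant)  # exclusive read-write access
        self._snapshot = dict(snapshot)       # copy of host-controlled values

    @property
    def invariant(self):
        # Read-only view: item assignment raises TypeError.
        return MappingProxyType(self._snapshot)

    def on_host_update(self, name, value):
        """Apply a change that the host reported in a message."""
        self._snapshot[name] = value


class Host:
    """Host side: keeps read-write control of segment-invariant variables."""

    def __init__(self, segment_invariant):
        self._invariant = dict(segment_invariant)
        self.toe = None

    def set_invariant(self, name, value):
        self._invariant[name] = value
        if self.toe is not None:
            self.toe.on_host_update(name, value)  # message to the TOE


host = Host({"mss": 1460})  # "mss" is a hypothetical example variable
toe = Toe({"snd_nxt": 0}, {"mss": 1460})
host.toe = toe
host.set_invariant("mss", 536)
```

The design point is that each variable has exactly one writer: the TOE for offloaded segment-variant variables and the host for everything else, with host-side changes propagated by message.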

Advantageously, said TCP connection is in an
ESTABLISHED state.

Advantageously, said TCP connection variables 
are TCP connection variables that are independent of 
bandwidth delay product. 

Advantageously, said host provides connection 
setups. 

Advantageously, the system provides resistance 
to DoS attacks by allowing said host to handle said con- 
nection setups. 

Advantageously, said host handles all TCP states
exclusive of an ESTABLISHED state which may be of- 
floaded to said TOE. 

Advantageously, said TOE handles only connec- 
tions that are in performance sensitive states. 

Advantageously, said host processes resource
utilization statistics in helping to determine which connections
to offload and which connections to upload.

Advantageously, said host determines which con- 
nections to offload and which connections to upload. 

Advantageously, at least one of said TOE and a
device driver software for said TOE determines at least
one of TCP connections to offload and TCP connections
to upload.

According to an aspect of the invention, a system
for providing connection offload comprises:

a host; and

a network interface card (NIC) coupled to said host,
wherein, for a particular connection offloaded to
said NIC, control of state information is split between
said host and said NIC and said NIC uploads
at least a portion of updated connection variables
for said particular connection to said host.

Advantageously, said particular connection employs
a connection-oriented transport layer protocol
(TLP).

Advantageously, said connection-oriented TLP 
comprises a TCP. 

Advantageously, said host transfers control of
segment-variant variables corresponding to said particular
connection to said NIC.

According to an aspect of the invention, a method 
for providing TCP/IP offload comprises: 

deciding to offload a particular TCP connection from
a host to a TOE;

transferring control of connection variables of said
particular TCP connection from said host to said
TOE and transferring a snapshot of remaining connection
variables whose control was not transferred
to said TOE;

managing said particular TCP connection via said
TOE using said at least a portion of said connection
variables transferred to said TOE and at least a portion
of said snapshot; and

updating at least a portion of said connection variables
and a portion of said snapshot and transferring
said updated at least said portion of said connection
variables and said portion of said snapshot
back to said host.

Advantageously, said one or more connection variables
of said particular TCP connection transferred to
said TOE comprise at least one segment-variant variable
of said particular TCP connection.

Advantageously, said connection variables of said
particular TCP connection transferred to said TOE lack
segment-invariant variables of said particular TCP connection.

Advantageously, said connection variables of said
particular TCP connection transferred to the TOE lack
segment-invariant variables and connection-invariant
variables of said particular TCP connection.

Advantageously, the method further comprises:

determining if at least one of said connection vari- 
ables controlled by said host have changed; 
notifying said TOE of changes in said at least one 
connection variables controlled by said host that 
has changed; and 

updating said connection variables in said TOE in 
accordance with said notified changes. 

According to an aspect of the invention, a method 
for providing TCP/IP offload comprises: 

deciding to offload an established TCP connection 
from a host to a TOE; 

transferring control of segment-variant variables to 
said TOE from said host; 

sending a snapshot of segment-invariant variables 
and connection-invariant variables to said TOE; 
independently processing incoming TCP packets 
via said TOE based upon said segment-variant var- 
iables and said snapshot; and 
updating at least a portion of said sent snapshot and 
at least a portion of said segment-variant variables 
and transferring at least a portion of said updated 
at least said portion of said sent snapshot and at 
least said portion of said updated segment-variant 
variables back to said host. 
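
The steps of this method can be sketched as three small functions. The sketch is a simplification under stated assumptions: connections are plain dictionaries, the names `offload`, `toe_receive` and `upload` are hypothetical, `mss` is only an example of a variable in the snapshot, and the TOE is assumed to accept in-order segments only.

```python
import copy


def offload(host_conn):
    """Transfer control of segment-variant variables and send a snapshot."""
    return {
        # Control moves to the TOE, so the host no longer holds these.
        "variant": host_conn.pop("variant"),
        # The host keeps control of the rest; the TOE gets a copy.
        "snapshot": copy.deepcopy(host_conn["invariant"]),
    }


def toe_receive(toe_conn, seq, payload_len):
    """Process one incoming segment on the TOE without involving the host."""
    variant = toe_conn["variant"]
    if seq != variant["rcv_nxt"]:
        return False  # simplification: only in-order segments are accepted
    variant["rcv_nxt"] = (variant["rcv_nxt"] + payload_len) % 2**32
    return True


def upload(host_conn, toe_conn):
    """Transfer the updated variables and snapshot back to the host."""
    host_conn["variant"] = toe_conn.pop("variant")
    host_conn["invariant"].update(toe_conn.pop("snapshot"))


conn = {"variant": {"rcv_nxt": 5000}, "invariant": {"mss": 1460}}
toe_conn = offload(conn)
accepted = toe_receive(toe_conn, seq=5000, payload_len=1460)
upload(conn, toe_conn)
```

After `upload`, the host again controls the segment-variant variables, now reflecting the progress the TOE made while it owned the connection.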

According to an aspect of the invention, a method 
for processing a TCP connection comprises: 

establishing the TCP connection; and 
sharing a control plane for said TCP connection be- 
tween a host and a TOE; and 
communicating updated TCP connection variables 
from said TOE back to said host. 

Advantageously, said sharing of said control plane 
comprises transferring control of segment-variant vari- 
ables corresponding to said TCP connection to said 
TOE. 

Advantageously, the method further comprises 
uploading said TCP connection to said host from said 
TOE. 

Advantageously, uploading said TCP connection 
comprises transferring control of segment-variant vari- 
ables corresponding to said TCP connection to said 
host. 

Advantageously, the method further comprises of- 
floading said uploaded TCP connection to said TOE 
from said host. 

Advantageously, offloading said uploaded TCP 
connection comprises transferring said control of said 
segment-variant variables corresponding to said up- 
loaded TCP connection to said TOE. 

According to an aspect of the invention, a method 
for TCP offload comprises: 



acquiring TCP connection variables from a host; 
managing at least one TCP connection using said 
acquired TCP connection variables; 
updating at least a portion of said acquired TCP
connection variables; and

transferring said updated at least a portion of said 
acquired TCP connection variables back to said 
host. 

Advantageously, said TCP connection variables 
are independent of bandwidth delay product. 

Advantageously, the method further utilizes at
least a portion of said updated at least said portion of
said acquired TCP connection variables to process said
at least one TCP connection by said host.

Advantageously, the method further comprises 
pulling said TCP connection variables from a stack. 

Advantageously, the method further comprises 
pushing said updated at least a portion of said acquired 
TCP connection variables onto a stack. 

According to an aspect of the invention, a machine-readable
storage is provided, having stored thereon,
a computer program having at least one code section
for providing TCP offload, the at least one code section
being executable by a machine for causing the machine
to perform steps comprising:

acquiring TCP connection variables from a host; 
managing at least one TCP connection using said 
acquired TCP connection variables; 
updating at least a portion of said acquired TCP 
connection variables; and 
transferring said updated at least a portion of said 
acquired TCP connection variables back to said 
host. 

Advantageously, said TCP connection variables 
are independent of bandwidth delay product. 

Advantageously, the machine-readable storage
further comprises code for utilizing at least a portion of
said updated at least said portion of said acquired TCP
connection variables to process said at least one TCP
connection by said host.

Advantageously, the machine-readable storage 
further comprises code for pulling said TCP connection 
variables from a stack. 

Advantageously, the machine-readable storage 
further comprises code for pushing said updated at least 
a portion of said acquired TCP connection variables on- 
to a stack. 

[0021] These and other advantages, aspects and 
novel features of the present invention, as well as details 
of an illustrated embodiment thereof, will be more fully 
understood from the following description and drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0022] FIG. 1 is a block diagram of a system that provides
TCP/IP offload in accordance with an embodiment
of the invention.

[0023] FIG. 2 is a flow chart illustrating exemplary
steps for TCP/IP offloading in accordance with an embodiment
of the invention.
[0024] FIG. 3 is a flow chart illustrating exemplary 
steps for providing TCP/IP offload in accordance with 
an embodiment of the invention. 
[0025] FIG. 4 is a flow chart illustrating exemplary 
steps that may be utilized for TCP offload in accordance 
with an embodiment of the invention. 

DETAILED DESCRIPTION OF THE INVENTION 

[0026] Certain aspects of the invention may provide 
a method for TCP offload, which may include acquiring 
TCP connection variables from a host and managing at 
least one TCP connection using the acquired TCP con- 
nection variables. At least a portion of the acquired TCP 
connection variables may be updated and at least some 
of the updated TCP connection variables may be trans- 
ferred back to the host. In accordance with an aspect of 
the invention, the TCP connection variables may be var- 
iables that are independent of bandwidth delay product. 
At least a portion of the updated TCP connection varia- 
bles may be utilized by the host to process the TCP con- 
nection or another TCP connection. A stack may be uti- 
lized to transfer the TCP connection variables between 
at least the host and a TOE. In this regard, the host may 
push the TCP connection variables onto the stack and 
the TOE may pull the TCP connection variables from 
the stack. Also, the updated TCP connection variables 
may be placed on the stack by the TOE and the host 
may subsequently pull the updated TCP connection var- 
iables from the stack. 

[0027] With regard to TCP segmentation, each of the
outbound TCP blocks is smaller than a permissible
TCP window size utilized for TCP segmentation. However,
the invention is not limited in this regard. Accordingly,
in an aspect of the invention, a TOE device may
have the capability to provide much further TCP
processing and offload than a device that simply supports
TCP segmentation. Various aspects of the invention
may overcome the TCP segmentation limitation in
which TCP segmentation may only be done on a block
of data that is less than a TCP window size. In this regard,
in accordance with an aspect of the invention,
since the TOE supports management of TCP flow control,
the TOE may be adapted to segment large blocks of
data down to the individual packets. The TOE may ensure
that transmissions are scheduled such that the sender
never sends data beyond the TCP window. Additionally,
packetization in accordance with an embodiment of the
invention may be done beyond the TCP window size. The TOE
takes incoming received packets that are acknowledgement
packets for the outbound TCP data stream and
acknowledges those outbound packets. If the acknowledgement
packet causes the window size to increase,
then more packets may be sent out by the TOE device
in accordance with an aspect of the invention.
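
The window-limited transmission described in this paragraph can be sketched as follows. This is a minimal model, not the TOE's actual scheduler: the class name `WindowedSender`, the 1460-byte packet payload, and the use of plain byte offsets without sequence-number wraparound are assumptions made for the example.

```python
class WindowedSender:
    """Segments a large block but never sends beyond the current window."""

    def __init__(self, total_bytes, window, mss=1460):
        self.total = total_bytes  # size of the block handed to the TOE
        self.window = window      # current TCP window in bytes
        self.mss = mss            # assumed payload bytes per packet
        self.snd_una = 0          # first unacknowledged byte offset
        self.snd_nxt = 0          # next byte offset to send

    def send_allowed(self):
        """Return the sizes of the packets that may be sent right now."""
        sent = []
        while self.snd_nxt < self.total:
            room = self.snd_una + self.window - self.snd_nxt
            size = min(self.mss, self.total - self.snd_nxt, room)
            if size <= 0:
                break  # window is full; wait for an acknowledgement
            sent.append(size)
            self.snd_nxt += size
        return sent

    def on_ack(self, ack, window):
        """Process an acknowledgement and send whatever the window now allows."""
        self.snd_una = max(self.snd_una, min(ack, self.snd_nxt))
        self.window = window
        return self.send_allowed()


sender = WindowedSender(total_bytes=16384, window=4096)
first_burst = sender.send_allowed()
second_burst = sender.on_ack(ack=2920, window=8192)
```

Here a 16 Kbyte block, larger than the initial 4096-byte window, is accepted in one piece; the first burst fills the window and the acknowledgement that enlarges the window releases further packets.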
[0028] Although TCP segmentation is a transmit-only 
related technology that does limited TCP processing of 
transmitted packets, the TOE in accordance with vari- 
ous embodiments of the invention is not so limited. In 
this regard, the TOE in accordance with an embodiment 
of the invention may process and manage both trans- 
mitted and received packets. Furthermore, a much 
broader range of TCP processing and management 
may be done by the TOE in accordance with the inven- 
tion than with a TCP segmentation device. For example, 
with TOE, TCP information may be passed to a NIC from 
an operating system and/or host system processor in 
such a manner that the NIC may be viewed as the owner
of the TCP connection. The NIC may then manage and 
update the TCP state information, which may include 
but is not limited to, TCP segment numbers and ac- 
knowledgment numbers. Subsequent to the processing 
and/or updating of the TCP state information, the proc- 
essed and/or updated information may be passed back 
to an operating system and/or host system processor. 
The host or system processor may then utilize the infor- 
mation passed back to it from the NIC. Notably, TCP 
segmentation does not provide this feedback of infor- 
mation to the host system processor and/or operating 
system. 

[0029] Certain embodiments of the invention may al- 
so provide a robust and efficient transmission control 
protocol/internet protocol (TCP/IP) offload scheme that 
may be adapted, for example, to allow the partition of 
TCP processing between a TCP/IP offload engine 
(TOE) and a host TCP/IP implementation. The host 
TCP/IP implementation may include one or more host 
TCP/IP applications and one or more host processors. 
For example, in one aspect of the invention, the TCP 
offload scheme may offload the connections that are in 
an ESTABLISHED state to the TOE. In other words, as- 
pects of the invention may include the offloading of cor- 
responding TCP state variables that may be utilized, for 
example, during the ESTABLISHED state. Accordingly,
the TCP/IP offload scheme may split a TCP control
plane between the host software and the TOE. The TOE 
may be designed, for example, to implement a subset 
or a minimum subset of the TCP control plane which 
may be less complex to implement and may utilize less 
memory. The TOE, which may be adapted to such an 
offload scheme, may be implemented in a cost effective 
manner. The more complicated aspects of TCP connec- 
tion management may be handled, for example, by the 
host software and may provide greater reliability and 
flexibility. 

[0030] FIG. 1 is a block diagram of a system that pro- 
vides TCP/IP offload in accordance with an embodiment 
of the invention. Referring to FIG. 1, the system may 
include, for example, a host 10, host application soft- 
ware 12 and a TOE 20. The host 10 may include, for 
example, a host CPU 30 and a host memory 40. The
host memory 40 may be adapted to include, for exam- 
ple, an application buffer 50. The application buffer 50 
may be adapted to include, for example, a transmission 
application buffer (TxBuf) 60 and a receive application
buffer (RxBuf) 70. The TOE 20 may include, for exam- 
ple, a direct memory access (DMA) engine 25 and a 
FIFO buffer 70. 

[0031] The host 10 may be coupled to the TOE 20 via
a host interface 80. The host interface may include, but
is not limited to, a peripheral component interconnect
(PCI) bus, PCI-X bus, ISA, SCSI or any other suitable
bus. The TOE 20 may be coupled to a physical communications
medium 90. The physical communications medium 90 may be a
wired medium, wireless medium or a combination thereof.
The physical communications medium 90 may include, but is
not limited to, Ethernet and
fibre channel. Although illustrated on opposite sides of
the host interface 80, the host 10 may be, at least in part,
disposed on a network interface card (NIC) that includes 
the TOE 20. Accordingly, in an aspect of the invention, 
the TCP state plane may be split between the host 10 
and the TOE 20. 

[0032] In one embodiment, a TCP connection may be 
completely described, for example, by three different 
sets of variables. The three sets of variables may be, for 
example, connection-invariant variables, segment-in- 
variant variables and segment-variant variables. The 
connection-invariant variables may be constant during 
the lifetime of the TCP connection. The segment-invar- 
iant variables may not change from TCP segment to 
TCP segment, but may change from time to time during 
the lifetime of the TCP connection. The segment-variant 
variables may change from TCP segment to TCP seg- 
ment. 

[0033] Connection-invariant variables may include, 
for example, source IP address, destination IP address, 
IP time-to-live (TTL), IP type-of-service (TOS), source 
TCP port number, destination TCP port number, initial 
send sequence number, initial receive sequence 
number, send window scaling factor and receive window 
scaling factor. 

[0034] Segment-invariant variables may include, but 
are not limited to, source MAC address, next hop's MAC 
address, MAC layer encapsulation, effective maximum 
segment size, keep-alive intervals and maximum allow- 
ance and flags such as, for example, nagle algorithm 
enable and keep-alive enable. 
[0035] Segment-variant variables may include, but 
are not limited to, IP packet identifier; send and receive 
sequence variables such as, for example, sequence 
number for first un-acked data (SND_UNA), sequence
number for next send (SND_NXT), maximum sequence 
number ever sent (SND_MAX), maximum send window 
(MAX_WIN), sequence number for next receive 
(RCV_NXT) and receive window size (RCV_WND). Additional
exemplary segment-variant variables may include
congestion window variables such as congestion
window (SND_CWIN) and slow start threshold (SSTHRESH);
and round trip time variables which may include,
but are not limited to, smoothed round trip time (RTT) 
and smoothed delta (DELTA). Other exemplary seg- 
ment-variant variables may include time remaining for 
retransmission, time remaining for delay acknowledge- 
ment, time remaining for keep alive, time remaining for 
PUSH and TCP state and timestamp. 
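
The three sets of variables in paragraphs [0032] to [0035] map naturally
onto three records with different mutability. The sketch below is an
assumption about one possible layout, not the applicant's data structure;
the class and field names are hypothetical, and only a subset of the
listed variables is shown.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class ConnectionInvariant:
    """Constant for the lifetime of the connection (hence frozen)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    initial_send_seq: int
    initial_recv_seq: int


@dataclass
class SegmentInvariant:
    """Stable from segment to segment, but may change occasionally."""
    next_hop_mac: str
    effective_mss: int
    nagle_enabled: bool = True
    keepalive_enabled: bool = False


@dataclass
class SegmentVariant:
    """May change with every segment sent or received."""
    snd_una: int
    snd_nxt: int
    snd_max: int
    rcv_nxt: int
    rcv_wnd: int
    snd_cwin: int
    ssthresh: int


@dataclass
class TcpConnection:
    fixed: ConnectionInvariant
    slow: SegmentInvariant
    fast: SegmentVariant


conn = TcpConnection(
    fixed=ConnectionInvariant("10.0.0.1", "10.0.0.2", 40000, 80, 1000, 5000),
    slow=SegmentInvariant(next_hop_mac="00:11:22:33:44:55", effective_mss=1460),
    fast=SegmentVariant(1000, 1000, 1000, 5000, 65535, 2920, 65535),
)
conn.fast.snd_nxt += 1460                       # changes per segment
conn.slow.next_hop_mac = "66:77:88:99:aa:bb"    # changes from time to time
try:
    conn.fixed.dst_port = 8080                  # never changes
except FrozenInstanceError:
    fixed_is_immutable = True
```
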
[0036] During operation, if a TCP connection is not offloaded,
then at least some of the three sets of variables
including the connection-invariant variables, the
segment-invariant variables and the segment-variant
variables may be owned by the host software of the host
10. If the TCP connection is not offloaded, then the TOE
20 may not have access to these variables. However,
once the variables are offloaded, the TOE 20 may be
configured to update the variables which may be associated
with both transmission and reception and pass
the updated transmission and reception variables back
to the host 10. In this regard, the TOE may update variables
that are independent of TCP bandwidth delay
product and pass these updated variables back to the
host 10 for processing.

[0037] FIG. 2 is a flow chart illustrating exemplary 
steps for TCP/IP offloading in accordance with an em- 
bodiment of the invention. Referring to FIG. 2, if a con- 
nection is offloaded to the TOE 20, then in step 202, the 
host software may transfer control of the segment-vari- 
ant variables to the TOE 20. In one example, a portion 
of the host software protocol control block or TCP con- 
trol block may be transferred to the TOE 20. In step 204, 
the host software may take a snapshot of the remaining 
variables such as the connection-invariant variables 
and/or the segment-invariant variables and send the
snapshot to the TOE 20. In one example, the snapshot 
may be used over and over again by the TOE 20. In step 
206, the host software may post a buffer in the host 
memory 40. For example, the host software may post 
the application buffer 50 in the host memory 40 and may 
set up the transmit application buffer (TxBuf) 60 and the 
receive application buffer (RxBuf) 70 in the application 
buffer 50. In step 208, the TOE 20 may be responsible 
for managing the complete TCP connection, including, 
for example, segmentation, acknowledgement process- 
ing, windowing and congestion avoidance. In step 210, 
at least a portion of the variables that have been updated 
may be transferred back to the host for processing. 
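
The hand-off in steps 202, 204, 208 and 210 of FIG. 2 can be sketched as
below. This is a simplified model under stated assumptions: variables are
held in plain dictionaries, the Host and Toe classes are hypothetical,
TOE processing is reduced to advancing RCV_NXT, and the buffer posting of
step 206 is omitted.

```python
import copy


class Toe:
    def __init__(self):
        self.segment_variant = None   # controlled by the TOE only while offloaded
        self.snapshot = None          # copy of host-owned variables

    def receive_segment(self, payload_len):
        # Step 208 (greatly simplified): in-order data advances RCV_NXT.
        self.segment_variant["rcv_nxt"] += payload_len


class Host:
    def __init__(self, segment_variant, segment_invariant, connection_invariant):
        self.segment_variant = segment_variant
        self.segment_invariant = segment_invariant
        self.connection_invariant = connection_invariant

    def offload(self, toe):
        # Step 202: transfer control of the segment-variant variables.
        toe.segment_variant, self.segment_variant = self.segment_variant, None
        # Step 204: send a snapshot (a copy) of the remaining variables.
        toe.snapshot = copy.deepcopy(
            {**self.connection_invariant, **self.segment_invariant}
        )

    def upload(self, toe):
        # Step 210: updated variables are transferred back to the host.
        self.segment_variant, toe.segment_variant = toe.segment_variant, None


host = Host(
    segment_variant={"snd_nxt": 100, "rcv_nxt": 500},
    segment_invariant={"next_hop_mac": "00:11:22:33:44:55"},
    connection_invariant={"dst_port": 80},
)
toe = Toe()
host.offload(toe)
toe.receive_segment(1460)
host.upload(toe)
```

Because control moves rather than being duplicated, exactly one side holds
the segment-variant variables at any time, while the snapshot is a copy
that the TOE can reuse.
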
[0038] For example, by controlling the segment-vari- 
ant variables and using the snapshot of the remaining 
variables, the TOE 20 may process or independently 
process, incoming TCP segments from the physical 
communications medium 90 and may place at least a
portion, such as a payload, of the incoming TCP segments
into the host memory 40 via the DMA engine 25.
In this regard, the incoming TCP segment payload may
be placed in the receive application buffer (RxBuf) 70 portion of the
application buffer 50 via the DMA engine 25. 
[0039] In one embodiment of the invention, while the 
TOE 20 may be adapted to manage the complete TCP
connection, the TOE 20 may have exclusive read-write 
access to offloaded segment-variant variables and may 
exclusively update the offloaded segment-variant variables.
The host software or host application software 12
may have read-write access to the segment-invariant
variables. The TOE 20 may have read-only access to 
the segment-invariant variables. If the host application 
software 12 changes the variables such as the next 
hop's MAC address, the host application software 12 
may notify the TOE 20 by, for example, sending a mes- 
sage to the TOE 20. The TOE 20 may then update the 
variables. The updated variables may be fed back to the 
host application software 12 where they may be utilized
for TCP processing, for example. Accordingly, the con- 
nection-invariant variables may exist in both the host 
software and the TOE 20. 

[0040] FIG. 3 is a flow chart illustrating exemplary 
steps for providing TCP/IP offload in accordance with 
an embodiment of the invention. Referring to FIG. 3, in 
step 302, the host 10 may determine whether one or
more of the connection variables such as the segment- 
invariant variables controlled by the host 10 have 
changed. For example, the host software may change 
one or more variables such as a next hop MAC address. 
If one or more of the connection variables controlled by 
the host 10 are not changed, then the process may be 
complete. If one or more of the connection variables 
controlled by the host 10 are changed, then, in step 304,
the host software may notify the TOE 20 of the change 
in the one or more connection variables controlled by 
the host 10. In step 306, the TOE 20 may accordingly 
update one or more of the variables. In step 308, the 
TOE may pass the updated variables back to the host 
10. 
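
The notification flow of FIG. 3 can be sketched as follows. The names
ToeSnapshot and host_update are hypothetical, and the message from host
to TOE is modelled as a direct method call, which is an assumption made
for brevity.

```python
class ToeSnapshot:
    """TOE-side copy of host-controlled variables (illustrative only)."""

    def __init__(self, variables):
        self._vars = dict(variables)

    def notify(self, name, value):
        # Step 306: the TOE updates its copy when notified.
        self._vars[name] = value
        # Step 308: the updated variables are passed back to the host.
        return dict(self._vars)


def host_update(host_vars, toe, name, new_value):
    # Step 302: has a host-controlled variable changed?
    if host_vars.get(name) == new_value:
        return None                    # nothing changed; the process is complete
    host_vars[name] = new_value
    # Step 304: notify the TOE of the change.
    return toe.notify(name, new_value)


host_vars = {"next_hop_mac": "00:11:22:33:44:55", "effective_mss": 1460}
toe = ToeSnapshot(host_vars)
unchanged = host_update(host_vars, toe, "effective_mss", 1460)
changed = host_update(host_vars, toe, "next_hop_mac", "66:77:88:99:aa:bb")
```
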

[0041] Some embodiments according to the present 
invention may include one or more of the following ad- 
vantages. Some embodiments may be more reliable 
and may provide for the uploading of connections from
the TOE to the host and offloading of connections from 
the host to the TOE at any time. Since less state infor- 
mation may be kept by the TOE hardware, uploading 
and offloading, for example, selected connections can 
be accelerated. An offloaded connection may be up- 
loaded by returning control of, for example, the seg- 
ment-variant variables corresponding to the offloaded 
connection back to the host 10. The uploaded connec- 
tion may subsequently be offloaded by transferring, for 
example, the control of the segment-variant variables 
corresponding to the uploaded connection to the TOE 
20. 

[0042] FIG. 4 is a flow chart illustrating exemplary 
steps that may be utilized for TCP offload in accordance 
with an embodiment of the invention. Referring to FIG. 
4, in step 402, a TOE may acquire or receive variables 
that are independent of the bandwidth delay product 
from a host system. In step 404, the TOE may manage 
the connection utilizing the acquired or received variables
that are independent of the bandwidth delay product.
In step 406, the TOE may update at least a portion
of the acquired variables that are independent of the 
bandwidth delay product. In step 408, at least a portion 
of the updated variables that are independent of the 
bandwidth delay product may be transferred back to the host. In step
410, the host may utilize the updated variables that are 
independent of the bandwidth delay product that have 
been transferred to it for TCP processing. 
[0043] In accordance with an aspect of the invention, 
a stack 14 may be utilized to facilitate the transfer of the
variables that are independent of the bandwidth delay
product. The stack 14 may be implemented in hardware,
software or a combination thereof. Notwithstanding, the 
TOE may be adapted to pull information from the stack 
14 and to push updated information onto the stack 14. 
The host may also be adapted to push TCP information 
onto the stack 14 and to pull the updated information 
from the stack 14. Accordingly, with reference to step 
402, the TOE may pull the variables that are independ- 
ent of the bandwidth delay product from the stack 14. 
With reference to step 406, after the TOE updates the 
acquired variables that are independent of the band- 
width delay product, the updated variables that are in- 
dependent of the bandwidth delay product may be 
pushed onto the stack 14. In this regard, with reference 
to step 408, the host may then pull the updated variables 
that are independent of the bandwidth delay product 
from the stack 14. 
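
The push and pull exchange of paragraph [0043] can be illustrated with a
small software model. The application does not specify the ordering or
layout of the stack 14, so the last-in, first-out list used here, the
class name VariableStack and the variable names are all assumptions.

```python
class VariableStack:
    """Hypothetical software stand-in for the stack 14."""

    def __init__(self):
        self._items = []

    def push(self, variables: dict) -> None:
        self._items.append(dict(variables))   # store a copy

    def pull(self) -> dict:
        return self._items.pop()              # most recently pushed entry


stack = VariableStack()
stack.push({"snd_nxt": 1000, "rcv_nxt": 5000})  # host pushes TCP information
toe_vars = stack.pull()                          # step 402: TOE pulls the variables
toe_vars["snd_nxt"] += 1460                      # steps 404 and 406: manage and update
stack.push(toe_vars)                             # TOE pushes the updated variables
host_vars = stack.pull()                         # step 408: host pulls the updates
```
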

[0044] The TOE may provide a more flexible approach
to TCP processing compared to a TCP segmentation
offload device, since the TOE device may facilitate
TCP processing on both the receive side and the transmit
side. Additionally, since the TOE may be adapted to
handle receive and transmit variables, the TOE provides 
a more flexible and efficient methodology for supporting 
the efficient setup and tear down of network connec- 
tions. 

[0045] Certain embodiments of the invention may of- 
fer better resistance against denial-of-service (DoS) at- 
tacks or other attacks as connection setup may be han- 
dled by a host that is more flexible and more powerful 
than the TOE NIC. In a DoS attack, an attacker attempts 
to consume as many resources as possible on the targeted or attacked
system, thereby preventing the targeted system
from providing services to other network devices. The 
frequent introduction of new attacks may make a flexible 
host with sufficient memory and CPU power a better 
choice for running connection setup. The flexible host 
may be a better choice than, for example, a particular 
hardware TOE that may have limited code space, com- 
puter power, system knowledge and flexibility. In addi- 
tion, the decision to honor a connection request may, at 
times, be based upon, for example, sophisticated and 
dynamic heuristics. 

[0046] Aspects of the invention may also provide bet- 
ter overall system performance and efficiency. The TOE 
NIC may be more efficient in handling, for example, connections
that are in performance sensitive states of the
TCP state machine. In particular, when the TOE NIC 
handles only connections that are in performance sen- 
sitive states of the TCP state machine, additional limited 
hardware resources may become available. Accordingly,
the TOE NIC may be adapted to upload connections
that are no longer in performance sensitive states and 
to offload connections that are in performance sensitive 
states. Such actions may positively impact figures
of merit such as, for example, hardware TOE efficiency.
Other aspects of the invention may be more efficient and 
may provide better overall system performance because,
for example, the host may use flexible, changing,
easy-to-update, easy-to-upgrade and more sophisticat- 
ed algorithms to decide which connections to offload or 
to upload. 

[0047] Some embodiments according to the present 
invention may provide statistics to the host relating to 
resource utilization. The statistics may include, for ex- 
ample, one or more of the following: available resourc- 
es; utilization of bandwidth per offloaded connection; 
number of frames per offloaded connection; errors per 
offloaded connection; change of state of a transport lay- 
er protocol (TLP) such as, for example, TCP, or an upper 
layer protocol (ULP); trend of utilization such as uptake 
in rate, slow down, for example; and resource consump- 
tion per offloaded connection. The host may use the sta- 
tistical information at its own discretion to help drive the 
upload or offload decision process. For example, the 
host may utilize the statistical information to upload 
some connections while offloading others. The host may 
also contemplate other criteria such as modes of oper- 
ation, computation or network load profiles, presently 
executed applications and roles in the network, for ex- 
ample. Some of these criteria may be dynamic criteria. 
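
As one illustration of how a host might use such statistics, the sketch
below keeps the busiest connections on the TOE. The policy itself
(ranking by bandwidth against a fixed number of TOE slots) is purely an
assumption, since the application leaves the decision criteria to the
host; ConnStats, plan and toe_slots are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ConnStats:
    conn_id: int
    bytes_per_sec: float   # utilization of bandwidth per connection
    offloaded: bool        # True if the TOE currently manages the connection


def plan(stats, toe_slots):
    """Return (to_offload, to_upload) so the busiest connections occupy the TOE."""
    ranked = sorted(stats, key=lambda s: s.bytes_per_sec, reverse=True)
    keep = {s.conn_id for s in ranked[:toe_slots]}
    to_offload = [s.conn_id for s in stats if s.conn_id in keep and not s.offloaded]
    to_upload = [s.conn_id for s in stats if s.conn_id not in keep and s.offloaded]
    return to_offload, to_upload


stats = [
    ConnStats(conn_id=1, bytes_per_sec=10.0, offloaded=True),
    ConnStats(conn_id=2, bytes_per_sec=9000.0, offloaded=False),
    ConnStats(conn_id=3, bytes_per_sec=5000.0, offloaded=True),
]
to_offload, to_upload = plan(stats, toe_slots=2)
```

Here the nearly idle connection 1 is uploaded to the host and the busy
connection 2 is offloaded in its place.
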
[0048] Certain embodiments of the invention may al- 
so provide fail-over support from a failed TOE NIC to an 
operating TOE NIC. Fail-over may include, for example, 
designating a NIC as having failed when the network 
cable is unplugged from the network or any other failure 
of an existing network link. Thus, even though the hard- 
ware of one TOE NIC may fail, the connection may still 
be maintained by transferring state information associ- 
ated with the failed TOE NIC to another functional TOE 
NIC. The robustness of the transfer may be further en- 
hanced by part of the connection state information being 
maintained by the host and part of the connection state 
information being maintained by the TOE NIC. 
[0049] Accordingly, the present invention may be re- 
alized in hardware, software, or a combination of hard- 
ware and software. The present invention may be real- 
ized in a centralized fashion in one computer system or 
in a distributed fashion where different elements are 
spread across several interconnected computer sys- 
tems. Any kind of computer system or other apparatus 
adapted for carrying out the methods described herein 
is suited. A typical combination of hardware and software
may be a general-purpose computer system with
a computer program that, when being loaded and executed,
controls the computer system such that it carries
out the methods described herein. 
[0050] The present invention also may be embedded 
in a computer program product, which comprises all the 
features enabling the implementation of the methods 
described herein, and which when loaded in a computer 
system is able to carry out these methods. Computer 
program in the present context means any expression, 
in any language, code or notation, of a set of instructions 
intended to cause a system having an information 
processing capability to perform a particular function ei- 
ther directly or after either or both of the following: a) 
conversion to another language, code or notation; b) re- 
production in a different material form. 
[0051] While the present invention has been de- 
scribed with reference to certain embodiments, it will be 
understood by those skilled in the art that various chang- 
es may be made and equivalents may be substituted 
without departing from the scope of the present inven- 
tion. In addition, many modifications may be made to 
adapt a particular situation or material to the teachings 
of the present invention without departing from its 
scope. Therefore, it is intended that the present inven- 
tion not be limited to the particular embodiment dis- 
closed, but that the present invention will include all em- 
bodiments falling within the scope of the appended 
claims. 



Claims 

1. A system for providing TCP/IP offload, comprising: 

a host; and

a TCP/IP offload engine (TOE) coupled to said 
host, wherein said host transfers control of at 
least a portion of TCP connection variables to 
said TOE. 

2. The system according to claim 1, wherein said
TOE provides updated TCP variables back to said
host.

3. The system according to claim 1 or 2, wherein said
host transfers control of segment-variant TCP con- 
nection variables to said TOE and said TOE pro- 
vides updated TCP segment-variant TCP connec- 
tion variables back to said host. 

4. A system for providing connection offload, compris- 
ing: 

a host; and 

a network interface card (NIC) coupled to said
host, wherein, for a particular connection offloaded
to said NIC, control of state information
is split between said host and said NIC and said
NIC uploads at least a portion of updated connection
variables for said particular connection
to said host. 

5. A method for providing TCP/IP offload, comprising:

deciding to offload a particular TCP connection 
from a host to a TOE; 

transferring control of connection variables of
said particular TCP connection from said host
to said TOE and transferring a snapshot of remaining
connection variables whose control
was not transferred to said TOE;
managing said particular TCP connection via
said TOE using said at least a portion of said
connection variables transferred to said TOE
and at least a portion of said snapshot; and
updating at least a portion of said connection
variables and a portion of said snapshot and
transferring said updated at least said portion
of said connection variables and said portion of
said snapshot back to said host.

6. The method according to claim 5, wherein said one
or more connection variables of said particular TCP
connection transferred to said TOE comprise at
least one segment-variant variable of said particular
TCP connection.

7. A method for providing TCP/IP offload, comprising:

deciding to offload an established TCP connection
from a host to a TOE;
transferring control of segment-variant variables
to said TOE from said host;
sending a snapshot of segment-invariant
variables and connection-invariant variables
to said TOE;
independently processing incoming TCP packets
via said TOE based upon said segment-variant
variables and said snapshot; and
updating at least a portion of said sent snapshot
and at least a portion of said segment-variant
variables and transferring at least a portion of
said updated at least said portion of said sent
snapshot and at least said portion of said updated
segment-variant variables back to said
host.

8. A method for processing a TCP connection, comprising:

establishing the TCP connection; and 
sharing a control plane for said TCP connection 
between a host and a TOE; and 
communicating updated TCP connection variables
from said TOE back to said host.

9. A method for TCP offload, the method comprising: 



acquiring TCP connection variables from a 
host; 

managing at least one TCP connection using 
said acquired TCP connection variables; 
updating at least a portion of said acquired TCP 
connection variables; and 
transferring said updated at least a portion of 
said acquired TCP connection variables back 
to said host. 

10. A machine-readable storage, having stored there- 
on, a computer program having at least one code 
section for providing TCP offload, the at least one 
code section being executable by a machine for 
causing the machine to perform steps comprising: 

acquiring TCP connection variables from a 
host; 

managing at least one TCP connection using 
said acquired TCP connection variables; 
updating at least a portion of said acquired TCP 
connection variables; and 
transferring said updated at least a portion of 
said acquired TCP connection variables back 
to said host. 